Markov locality and relating it to p-locality

Neural Information Processing Systems

To gain intuition for how p-locality functions, we introduce another notion of locality, called Markov locality, which uses the language of Markov blankets. We will prove that under relatively relaxed conditions p-locality and Markov locality are equivalent. This will allow us to relate the notion of locality to various graph structures commonly used to represent probability distributions, and will be a key step in proving Properties 2.1 and 2.2.

We start by defining the Markov boundary, M(X,S), of a random variable X contained in a set of random variables S, as a minimal set such that p(X|S) = p(X|M(X,S)). That is, the Markov boundary is a minimal set of variables such that, once we condition on them, conditioning on any additional random variables in S does not change the probability of X [39]. Similarly, we define a Markov blanket, M(X,S), for X in S as any set of variables such that conditioning on M(X,S) makes X conditionally independent of all other variables [39]. In this way, every Markov boundary is a Markov blanket, but not all blankets are boundaries.

Markov locality: Given a probability distribution p(Z) and a function f: R^{N_X+N_Θ} → R^{N_Θ}, the update function f(Z) is Markov-local with respect to the distribution p over Z if and only if k: Z Ω s.t. [...]

A Markov boundary can be thought of as the set of variables that 'locally' communicate with the parameter Θ_k, thus providing a natural measure of locality. Importantly, for Markov locality to be of use, we would like the Markov boundaries of random variables in the model of interest to be unique.
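As a rough illustration of the defining property p(X|S) = p(X|M(X,S)), the sketch below checks candidate blankets numerically on a toy model. Everything in it is an assumption made for illustration and does not come from the paper: the binary chain A -> B -> C, the probability tables, and the helper names `joint`, `prob`, `cond_a`, and `max_gap`.

```python
from itertools import product

# Hypothetical binary chain A -> B -> C; all numbers are invented for illustration.
VARS = ("A", "B", "C")
p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # indexed [a][b]
p_c_given_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}  # indexed [b][c]


def joint(v):
    """Joint probability p(A, B, C) of a full assignment given as a dict."""
    return p_a[v["A"]] * p_b_given_a[v["A"]][v["B"]] * p_c_given_b[v["B"]][v["C"]]


def prob(fixed):
    """Marginal probability of a partial assignment, summing out free variables."""
    free = [name for name in VARS if name not in fixed]
    return sum(
        joint({**fixed, **dict(zip(free, values))})
        for values in product((0, 1), repeat=len(free))
    )


def cond_a(a, given):
    """Conditional probability p(A = a | given)."""
    return prob({"A": a, **given}) / prob(given)


def max_gap(candidate):
    """Largest |p(A | B, C) - p(A | candidate)| over all assignments.

    A gap of (numerically) zero means `candidate` behaves as a Markov
    blanket of A within S = {A, B, C}.
    """
    gap = 0.0
    for a, b, c in product((0, 1), repeat=3):
        full = {"B": b, "C": c}
        subset = {name: full[name] for name in candidate}
        gap = max(gap, abs(cond_a(a, full) - cond_a(a, subset)))
    return gap


if __name__ == "__main__":
    for candidate in [("B", "C"), ("B",), ("C",), ()]:
        print(candidate, max_gap(candidate))
```

In this toy chain, {B} and {B, C} both leave a gap of numerically zero, so both act as blankets of A, while the empty set and {C} do not. Since no proper subset of {B} suffices, {B} plays the role of the Markov boundary here, which matches the statement that every boundary is a blanket but not every blanket is a boundary.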







A Defining Markov locality and relating it to p-locality

Neural Information Processing Systems

... Markov locality, which will use the language of Markov blankets. ... Markov blanket but not all blankets are boundaries. A Markov boundary can be thought of as the set of variables that 'locally' communicate with the parameter Θ_k ... Importantly, for Markov locality to be of use, we would like the Markov boundaries of random variables in the model of interest to be unique. Assume all quantities are as in A.1, that the conditional independence relationships ... This proof relies on Lemma A.1, proved below. We wish to prove Eq. 2 ...


Formalizing locality for normative synaptic plasticity models Colin Bredenberg

Neural Information Processing Systems

Over the last several decades, computational neuroscience researchers have proposed a variety of "biologically plausible" models of synaptic plasticity that seek to provide normative accounts of a variety of learning processes in the brain. These models aim to explain how modifications of ...


Mini-Batch Consistent Slot Set Encoder for Scalable Set Encoding Andreis Bruno, Jeffrey Ryan Willette

Neural Information Processing Systems

Most existing set encoding algorithms operate under the implicit assumption that all the set elements are accessible, and that there are ample computational and memory resources to load the set into memory during training and inference.


Checklist

Neural Information Processing Systems

For all authors...
(a) Do the main claims made in the abstract and introduction accurately reflect the paper's contributions and scope?

If you ran experiments...
(a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] in supplementary
(b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)?

If you used crowdsourcing or conducted research with human subjects...
(a) Did you include the full text of instructions given to participants and screenshots, if applicable? [N/A]
(b) Did you describe any potential participant risks, with links to Institutional Review Board (IRB) approvals, if applicable? [N/A]
(c) Did you include the estimated hourly wage paid to participants and the total amount spent on participant compensation?

The goal of this section is to quantify how well our model generalizes to the test dataset, beyond interpolating the training dataset. This is also useful for comparing the performance of our model with that of standard ResNet architectures (which integrate batch normalization and training of the hidden layers).